28 research outputs found

    Neural Network Based Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction

    This paper introduces a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy for human-robot interaction without external sensors or human supervision. The robot learns to focus its attention on groups of people from its own audio-visual experiences, independently of the number of people, their positions and their physical appearances. In particular, we use a recurrent neural network architecture in combination with Q-learning to find an optimal action-selection policy; we pre-train the network using a simulated environment that mimics realistic scenarios involving speaking and silent participants, thus avoiding the need for tedious sessions of a robot interacting with people. Our experimental evaluation suggests that the proposed method is robust against parameter estimation, i.e. the parameter values yielded by the method do not have a decisive impact on the performance. The best results are obtained when audio and visual information are used jointly. Experiments with the Nao robot indicate that our framework is a step forward towards the autonomous learning of socially acceptable gaze behavior.
    Comment: Paper submitted to Pattern Recognition Letters
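For readers unfamiliar with Q-learning, the update rule that the abstract refers to can be sketched in tabular form. This is only an illustration: the paper uses a recurrent neural network rather than a table, and the function name `q_update`, the state and action labels, and the values of `alpha` and `gamma` are assumptions made here, not taken from the paper.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (illustrative; the paper uses a recurrent network).

    q is a dict mapping (state, action) pairs to value estimates; missing
    entries are treated as 0.0. alpha is the learning rate and gamma the
    discount factor; both values are assumptions chosen for this sketch.
    """
    # Value of the best action available in the next state.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    # Move the estimate towards the bootstrapped target r + gamma * max_a' Q(s', a').
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Hypothetical usage: two gaze actions and a reward of 1.0 for the chosen one.
q_table = {}
q_update(q_table, "s0", "left", 1.0, "s1", ["left", "right"])
```
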

    My Corporis Fabrica Embryo: An ontology-based 3D spatio-temporal modeling of human embryo development

    Background: Embryology is a complex morphologic discipline involving a set of entangled mechanisms that are sometimes difficult to understand and to visualize. Recent computer-based techniques, ranging from geometrical to physically based modeling, are used to assist the visualization and simulation of virtual humans in numerous domains such as surgical simulation and learning. On the other hand, the ontology-based approach to knowledge representation is increasingly adopted in the life-science domains to formalize biological entities and phenomena, thanks to a declarative approach for expressing and reasoning over symbolic information. 3D models and ontologies are two complementary ways of describing biological entities, yet they remain largely separated. Indeed, while many ontologies providing a unified formalization of anatomy and embryology exist, they remain purely descriptive and make it difficult to access the anatomical content of complex 3D embryology models and simulations. Results: In this work, we present a novel ontology describing human embryonic development through deforming 3D models. Beyond describing how organs and structures are composed, our ontology integrates a procedural description of their 3D representations, their temporal deformation and their relations with respect to their development. We also created inference rules to express complex connections between entities. The result is a unified description of both the knowledge of organ deformation and the corresponding 3D representations, making it possible to visualize the embryo deformation dynamically across the Carnegie stages. Through a simplified ontology containing representative entities linked to spatial position and temporal process information, we illustrate the added value of such a declarative approach for interactive simulation and visualization of 3D embryos. Conclusions: Combining ontologies and 3D models enables a declarative description of different embryological models that capture the complexity of human developmental anatomy. Visualizing embryos with 3D geometric models and their animated deformations may pave the way towards hypothesis-driven applications. These can also be used to assist the learning of this complex knowledge. Availability: http://www.mycorporisfabrica.org

    Analyse directionnelle du champ acoustique à l'Opéra de Helsinki [Directional analysis of the acoustic field at the Helsinki Opera House]

    Capturing acoustic fingerprints is essential both for preserving cultural heritage and for reproduction in virtual and augmented reality. Acoustic fingerprints also make it possible to analyze various properties of an acoustic field. Moreover, capturing impulse responses with an ambisonic microphone array allows analysis in the spherical harmonics domain and, potentially, analysis of the directional properties of the acoustic field. The importance of early reflections for perception is well known; late reverberation, however, is usually considered spatially diffuse. Recent research has demonstrated the importance of considering the position in space, as well as the orientation of the sound field, when reproducing late reverberation, which can be both anisotropic and inhomogeneous. To this end, an in-depth study was carried out in the main auditorium of the Finnish National Opera in Helsinki, where ambisonic acoustic fingerprints were captured from about a hundred different positions distributed throughout the hall. In this article, the captured database is analyzed to extract the main characteristics of the anisotropic and inhomogeneous sound field. Objective analysis tools are presented to extract the characteristics of this type of space, which combines not only the reverberation properties of coupled spaces but also a non-uniform directional distribution of energy during late reverberation.

    Simultaneous Estimation of Gaze Direction and Visual Focus of Attention for Multi-Person-to-Robot Interaction

    We address the problem of estimating the visual focus of attention (VFOA), i.e. who is looking at whom. This is of particular interest in human-robot interactive scenarios, e.g. when the task requires identifying targets of interest over time. The paper makes the following contributions. We propose a Bayesian temporal model that connects VFOA to gaze direction and to head pose. Model inference is then cast into a switching Kalman filter formulation, which makes it tractable. The model parameters are estimated via training based on manual annotations. The method is tested and benchmarked using a publicly available dataset. We show that both the gaze and the VFOA of several persons can be reliably and simultaneously estimated over time from observed head poses as well as from people and object locations. On average, our method compares favorably with two other methods.

    Assessing the anisotropic features of spatial impulse responses

    The direction-dependent characteristics of late reverberation have long been assumed to be perceptually isotropic, meaning that the energy of the decay should be perceived as equal from every direction. This assumption has been carried into the way reverberation has been approached for spatial sound reproduction. Now that new methods exist to capture the sound field, we need to revisit the way we analyze and render the decaying sound field and, more specifically, establish the perceptual threshold of direction-dependent characteristics of late reverberation. Towards this goal, this paper proposes the Energy Decay Deviation (EDD) as an objective measure of the directional decay. Based on the deviation of direction-dependent Energy Decay Curves (EDC) from a mean EDC, the EDD aims to highlight the direction-dependent features characterizing the decay. This paper presents the design considerations of the EDD, discusses its limitations, and shows practical examples of its use.
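The idea of comparing direction-dependent EDCs with a mean EDC can be sketched as follows. This is a minimal illustration rather than the paper's definition: it assumes each direction is given as an equal-length list of impulse-response samples, uses Schroeder backward integration for the EDC, and takes the deviation as a difference in decibels; the function names, the dB convention and the `floor` value are assumptions made here.

```python
import math


def energy_decay_curve(h):
    """Schroeder backward integration: edc[t] is the sum of h[k]**2 for k >= t."""
    edc = [0.0] * len(h)
    running = 0.0
    for t in range(len(h) - 1, -1, -1):
        running += h[t] * h[t]
        edc[t] = running
    return edc


def to_db(x, floor=1e-12):
    """Convert energy to decibels, clamping at a small floor to avoid log(0)."""
    return 10.0 * math.log10(max(x, floor))


def energy_decay_deviation(directional_irs):
    """Deviation (in dB) of each directional EDC from the mean EDC over directions.

    Assumes every impulse response in directional_irs has the same length.
    """
    edcs = [energy_decay_curve(h) for h in directional_irs]
    n = len(edcs[0])
    mean_edc = [sum(e[t] for e in edcs) / len(edcs) for t in range(n)]
    return [[to_db(e[t]) - to_db(mean_edc[t]) for t in range(n)] for e in edcs]
```

Under these assumptions, a perfectly isotropic decay (identical responses in every direction) yields a deviation of 0 dB everywhere, and any direction that decays faster or slower than the average shows up as a negative or positive curve.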

    Extended Gaze Following: Detecting Objects in Videos Beyond the Camera Field of View

    In this paper we address the problems of detecting objects of interest in a video and of estimating their locations, solely from the gaze directions of people present in the video. Objects can be located either inside or outside the camera field of view. We refer to this problem as extended gaze following. The contributions of the paper are as follows. First, we propose a novel spatial representation of the gaze directions adopting a top-view perspective. Second, we develop several convolutional encoder/decoder networks to predict object locations and compare them with heuristics and with classical learning-based approaches. Third, in order to train the proposed models, we generate a very large number of synthetic scenarios employing a probabilistic formulation. Finally, our methodology is empirically validated using a publicly available dataset.

    Deep Reinforcement Learning for Audio-Visual Gaze Control

    We address the problem of audiovisual gaze control in the specific context of human-robot interaction, namely how controlled robot motions are combined with visual and acoustic observations in order to direct the robot head towards targets of interest. The paper makes the following contributions: (i) a novel audiovisual fusion framework that is well suited for controlling the gaze of a robotic head; (ii) a reinforcement learning (RL) formulation of the gaze control problem, using a reward function based on the available temporal sequence of camera and microphone observations; and (iii) several deep architectures that make it possible to experiment with early and late fusion of audio and visual data. We introduce a simulated environment that enables us to learn the proposed deep RL model without spending hours on tedious interaction. By thoroughly experimenting on a publicly available dataset and on a real robot, we provide empirical evidence that our method achieves state-of-the-art performance.

    Kleene Algebra to Compute Invariant Sets of Dynamical Systems

    In this paper, we show that a basic fixed point method used to enclose the greatest fixed point in a Kleene algebra allows us to compute inner and outer approximations of invariant-based sets for continuous-time nonlinear dynamical systems. Our contribution is to provide the definitions and theorems that link the theory of invariant sets to Kleene algebra. This link has never been made before, and it allows us to rigorously compute sets that can be defined as a combination of positive invariant sets. Illustrative examples show the favorable properties of the approach.
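A greatest fixed point computation of the kind the abstract mentions can be illustrated on a much simpler, finite and discrete-time analogue. The sketch below is not the paper's method, which targets continuous-time systems and produces rigorous inner and outer enclosures; it only assumes a finite set of states and a deterministic successor function `step` (both hypothetical) and shrinks the candidate set until it is closed under `step`.

```python
def largest_invariant_subset(candidates, step):
    """Largest subset S of candidates such that step(x) is in S for every x in S.

    Computed as a greatest fixed point: start from the whole candidate set
    and repeatedly discard states whose successor has left the set.
    """
    current = set(candidates)
    while True:
        refined = {x for x in current if step(x) in current}
        if refined == current:
            return current
        current = refined


# Hypothetical six-state system: 0 and 1 swap, 5 is a rest point,
# and 4 -> 2 -> 3 -> 6 eventually leaves the candidate set.
successor = {0: 1, 1: 0, 2: 3, 3: 6, 4: 2, 5: 5}
invariant = largest_invariant_subset(range(6), successor.__getitem__)
```

Because the candidate set is finite and only shrinks, the loop terminates; the result is the greatest fixed point of the refinement operator on the lattice of subsets.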

    Comparative genomic and expression analysis of the adenosine signaling pathway members in Xenopus

    Adenosine is an endogenous molecule that regulates many physiological processes via the activation of four specific G-protein-coupled ADORA receptors. Extracellular adenosine may originate either from the hydrolysis of released ATP by the ectonucleotidases or from cellular exit via the equilibrative nucleoside transporters (SLC29A). Adenosine extracellular concentration is also regulated by its successive hydrolysis into uric acid by membrane-bound enzymes or by cell influx via the concentrative nucleoside transporters (SLC28A). All of these members constitute the adenosine signaling pathway and regulate adenosine functions. Although the roles of this pathway are quite well understood in adults, little is known regarding its functions during vertebrate embryogenesis. We have used Xenopus laevis as a model system to provide a comparative expression map of the different members of this pathway during vertebrate development. We report the characterization of the different enzymes, receptors, and nucleoside transporters in both X. laevis and X. tropicalis, and we demonstrate by phylogenetic analyses the high level of conservation of these members between amphibians and mammals. A thorough expression analysis of these members during development and in the adult frog reveals that each member displays a distinct, specific expression pattern. These data suggest potentially different developmental roles for these proteins and therefore for extracellular adenosine. In addition, we show that adenosine levels during amphibian embryogenesis are very low, confirming that they must be tightly controlled for normal development.